30 research outputs found

    Modeling and predicting temporal patterns of web content changes

    The technologies aimed at Web content discovery, retrieval and management face the compelling need of coping with its highly dynamic nature coupled with complex user interactions. This paper analyzes the temporal patterns of the content changes of three major news websites with the objective of modeling and predicting their dynamics. It has been observed that changes are characterized by a time-dependent behavior with large fluctuations and significant differences across hours and days. To explain this behavior, we represent the change patterns as time series. The trend and seasonal components of the observed time series capture the weekly and daily periodicity, whereas the irregular components take into account the remaining fluctuations. Models based on trigonometric polynomials and ARMA components accurately reproduce the dynamics of the empirical change patterns and provide extrapolations into the future to be used for forecasting.

    An exploratory analysis of the novelty of a news Web site

    The growing amount of information published on the Web, combine

    Stay Awhile and Listen: User Interactions in a Crowdsourced Platform Offering Emotional Support

    Internet and online-based social systems are rising as the dominant mode of communication in society. However, the public or semi-private environment in which most online communications operate does not make them suitable channels for speaking with others about personal or emotional problems. This has led to the emergence of online platforms for emotional support offering free, anonymous, and confidential conversations with live listeners. Yet very little is known about the way these platforms are utilized, and whether their features and design foster strong user engagement. This paper explores the utilization and the interaction features of hundreds of thousands of users on 7 Cups of Tea, a leading online platform offering online emotional support. It dissects the level of activity of these users and the patterns by which they engage in conversation with each other, and uses machine learning methods to find factors promoting engagement. The study may be the first to measure activities and interactions in a large-scale online social system that fosters peer-to-peer emotional support.

    Time series analysis of the dynamics of news websites

    The content of news websites changes frequently and rapidly, and its relevance tends to decay with time. To be of any value to users, tools such as search engines have to cope with these evolving websites and detect their changes in a timely manner. In this paper we apply time series analysis to study the properties and the temporal patterns of the change rates of the content of three news websites. Our investigation shows that changes are characterized by large fluctuations with periodic patterns and time-dependent behavior. The time series describing the change rate is decomposed into trend, seasonal and irregular components, and models of each component are then identified. The trend and seasonal components describe the daily and weekly patterns of the change rates. Trigonometric polynomials best fit these deterministic components, whereas the class of ARMA models represents the irregular component. The resulting models can be used to describe the dynamics of the changes and predict future change rates.

    A methodological framework for cloud resource provisioning and scheduling of data parallel applications under uncertainty

    Data parallel applications are being extensively deployed in cloud environments because of the possibility of dynamically provisioning storage and computation resources. To identify cost-effective solutions that satisfy the desired service levels, resource provisioning and scheduling play a critical role. Nevertheless, the unpredictable behavior of cloud performance makes the estimation of the resources actually needed quite complex. In this paper we propose a provisioning and scheduling framework that explicitly tackles uncertainties and performance variability of the cloud infrastructure and of the workload. This framework allows cloud users to estimate in advance, i.e., prior to the actual execution of the applications, the resource settings that cope with uncertainty. We formulate an optimization problem where the characteristics not perfectly known or affected by uncertain phenomena are represented as random variables modeled by the corresponding probability distributions. Provisioning and scheduling decisions, while optimizing various metrics such as monetary leasing costs of cloud resources and application execution time, take full account of uncertainties encountered in cloud environments. To test our framework, we consider data parallel applications characterized by a deadline constraint and we investigate the impact of their characteristics and of the variability of the cloud infrastructure. The experiments show that the resource provisioning and scheduling plans identified by our approach nicely cope with uncertainties and ensure that the application deadline is satisfied.

    Models of mail server workloads

    Electronic mail has become an integral part of our daily lives. With this trend, mail servers have to provide a fast, highly available, reliable and secure service. Hence, workload characterization and performance evaluation of mail servers are to be addressed as primary issues. This paper deals with a detailed characterization of mail server workloads. Our study is based on the analysis of a large set of measurements collected on various mail servers. We analyze SMTP and POP3 requests and we obtain models able to capture and reproduce their behavior and most relevant characteristics. These models represent the basis for the definition of the workload of SPECmail2001, a benchmark currently under development within SPEC to assess the ability of a system to act as a mail server

    Multivariate analysis of Web content changes

    News websites are expected to deliver the latest stories, as well as their latest developments, in a timely manner. Consequently, tools such as search engines need to cope with these rapid and frequent content changes by adjusting their crawling activities accordingly. In this paper we explore and model the properties and temporal behavior of the content changes of three major news websites. The dynamics of the changes is characterized by large fluctuations and significant differences from day to day and from hour to hour. However, a certain degree of similarity in the overall patterns of each website exists. In particular, the application of multivariate analysis techniques allows us to identify groups of days with similar change patterns, thus allowing for the customi